Abstract: In the second half of the 1990s Christian Mauduit and András Sárközy [86]
introduced a new quantitative theory of pseudorandomness of binary sequences.
Since then numerous papers have been written on this subject and the original theory
has been generalized in several directions. Here I give a survey of some of the most
important results involving the new quantitative pseudorandom measures of finite binary
sequences. This area has strong connections to finite fields; in particular, some
of the best known constructions are defined using characters of finite fields and their
pseudorandom measures are estimated via character sums.


Keywords: Pseudorandomness, Well Distribution, Correlation, Normality 


2010 Mathematics Subject Classifications: 11K45 


Katalin Gyarmati: Department of Algebra and Number Theory, Eötvös Loránd University, Budapest, 
Hungary, e-mail: gykati@cs.elte.hu 


1 Introduction 


In the twentieth and twenty-first centuries various pseudorandom objects have been
studied in cryptography and number theory since these objects are widely used in
modern cryptography, in applications of the Monte Carlo method and in wireless
communication (see [39]). Different approaches and definitions of pseudorandomness
can be found in several papers and books. Menezes, van Oorschot and Vanstone [95]
have written an excellent monograph about these approaches. The most frequently
used interpretation of pseudorandomness is based on complexity theory; Goldwasser [38]
has written a survey paper about this approach. However, recently the
complexity theory approach has been widely criticized. One problem is that in this
approach usually infinite sequences are tested while in the applications only finite
sequences are used. Another problem is that most results are based on certain unproved
hypotheses (such as the difficulty of factorization of integers). Finite pseudorandom
[0, 1) sequences have been studied by Niederreiter and others (see, for
example, [103-106]). Niederreiter [107] also studied random number generation and
quasi-Monte Carlo methods and their connections.
In the second half of the 1990s, Christian Mauduit and András Sárközy [86] introduced
a new constructive approach, in which the pseudorandomness of finite binary
sequences is well characterized, and they also constructed binary sequences (and
later other pseudorandom objects) with strong pseudorandom properties. In order to
characterize the pseudorandomness of binary sequences Mauduit and Sárközy introduced
new quantitative pseudorandom measures. Although earlier certain statistical
tests (see, for example, [95]) already existed and one could determine whether a sequence
passes these tests or not, the pseudorandom properties of the sequence were
not classified. We also mention that by using these tests it was possible to test a sequence
after generating it (a posteriori testing), but we did not have any a priori result
which guaranteed the applicability of the sequence before generating it. There are
two fundamental problems with a posteriori testing. Firstly, it could be quite lengthy
to check whether or not a sequence passes these tests and it is much faster if certain
properties of the construction guarantee that these tests are always passed for certain
theoretical reasons (a priori testing). Secondly, in the case of a posteriori testing
we always test only one certain, very special property of the sequence and nothing is
known about the other pseudorandom properties. By using the pseudorandom measures
of Mauduit and Sárközy it is possible to control several pseudorandom properties
of sequences and it is also possible to measure their quality. In [118] Rivat and
Sárközy estimated the outcome of certain basic statistical tests by the pseudorandom
measures $W$ and $C_\ell$ (see Section 2 below; the precise definitions of these tests can be
found, for example, in [95]). In [122] Sárközy gave a survey of this new constructive
theory of pseudorandomness. In the present survey we will focus mostly on pseudorandom
measures; we will study the most important properties of these measures
and their connections with other cryptographic tools.


2 Definition of the Pseudorandom Measures 


In [86] Mauduit and Sárkózy introduced the following pseudorandom measures in 
order to study the pseudorandom properties of finite binary sequences: 


Definition 2.1. For a binary sequence $E_N = (e_1,\dots,e_N) \in \{-1,+1\}^N$ of length $N$,
write

$$U(E_N,t,a,b) = \sum_{j=0}^{t} e_{a+jb}.$$

Then the well-distribution measure of $E_N$ is defined as

$$W(E_N) = \max_{a,b,t} |U(E_N,t,a,b)| = \max_{a,b,t} \left|\sum_{j=0}^{t} e_{a+jb}\right|,$$

where the maximum is taken over all $a,b,t$ such that $a,b,t \in \mathbb{N}$ and
$1 \le a \le a+tb \le N$.
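As a concrete illustration of Definition 2.1, the following Python sketch evaluates $W(E_N)$ by brute force. The function name `well_distribution` and the 0-based list representation of $E_N$ are assumptions made here for illustration, not notation from [86]; the exhaustive search over all arithmetic progressions is only practical for short sequences.

```python
from typing import Sequence


def well_distribution(e: Sequence[int]) -> int:
    """Brute-force W(E_N): max |e_a + e_{a+b} + ... + e_{a+tb}| with 1 <= a <= a+tb <= N."""
    n = len(e)
    best = 0
    for a in range(1, n + 1):          # first index of the progression (1-based)
        for b in range(1, n + 1):      # common difference of the progression
            partial = 0
            idx = a
            while idx <= n:            # each extra term corresponds to the next value of t
                partial += e[idx - 1]  # the list e is stored 0-based
                best = max(best, abs(partial))
                idx += b
    return best


if __name__ == "__main__":
    alternating = [-1, 1, -1, 1, -1, 1]
    # Every second element of the alternating sequence has the same sign,
    # so the progression with b = 2 already gives a large sum.
    print(well_distribution(alternating))
```

The example shows why arithmetic progressions matter: the alternating sequence is perfectly balanced on intervals, yet badly unbalanced along the progression with difference 2.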
The well-distribution measure quantifies how close the frequencies of the $+1$'s
and $-1$'s are in arithmetic progressions (for a binary sequence with strong pseudorandom
properties these two quantities are expected to be very close). But often it is also
necessary to study the connections between certain elements of the sequence. For
example, if the subsequence $(+1, +1)$ occurs much more frequently than the subsequence
$(-1, -1)$, it may cause problems in the applications, and we cannot say
that our sequence has strong pseudorandom properties. In order to study connections
of this type Mauduit and Sárközy [86] introduced the correlation and normality
measures:


Definition 2.2. For a binary sequence $E_N = (e_1,\dots,e_N) \in \{-1,+1\}^N$ of length $N$
and for $D = (d_1,\dots,d_\ell)$ with non-negative integers $0 \le d_1 < \dots < d_\ell$, write

$$V(E_N,M,D) = \sum_{n=1}^{M} e_{n+d_1}\cdots e_{n+d_\ell}.$$

Then the correlation measure of order $\ell$ of $E_N$ is defined as

$$C_\ell(E_N) = \max_{M,D} |V(E_N,M,D)| = \max_{M,D} \left|\sum_{n=1}^{M} e_{n+d_1}\cdots e_{n+d_\ell}\right|,$$

where the maximum is taken over all $D = (d_1,\dots,d_\ell)$ and $M$ such that
$0 \le d_1 < \dots < d_\ell < M + d_\ell \le N$.
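Definition 2.2 can be checked on small examples with the following Python sketch. The name `correlation_measure` is hypothetical, and the code enumerates every admissible $D$ and $M$, so its running time grows like $N^{\ell+1}$; it is meant only as an executable restatement of the definition.

```python
from itertools import combinations
from math import prod
from typing import Sequence


def correlation_measure(e: Sequence[int], order: int) -> int:
    """Brute-force C_l(E_N) over all 0 <= d_1 < ... < d_l and M with M + d_l <= N."""
    n = len(e)
    best = 0
    for d in combinations(range(n), order):   # strictly increasing shifts, d_l <= N - 1
        partial = 0
        for m in range(1, n - d[-1] + 1):      # M = 1, ..., N - d_l
            # term e_{m+d_1} * ... * e_{m+d_l}, converted to 0-based list indices
            partial += prod(e[m + di - 1] for di in d)
            best = max(best, abs(partial))
    return best


if __name__ == "__main__":
    alternating = [-1, 1, -1, 1, -1, 1, -1, 1]
    print(correlation_measure(alternating, 2), correlation_measure(alternating, 3))
```

For the alternating sequence the order-2 value is as large as possible while the order-3 value is minimal, which is why correlations of several orders have to be controlled together.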


Definition 2.3. For a binary sequence $E_N = (e_1,\dots,e_N) \in \{-1,+1\}^N$ of length $N$
and for $X = (x_1,\dots,x_\ell) \in \{-1,+1\}^\ell$ write

$$T(E_N,M,X) = |\{n : 0 \le n < M,\ (e_{n+1}, e_{n+2}, \dots, e_{n+\ell}) = X\}|.$$

Then the normality measure of order $\ell$ of $E_N$ is defined as

$$N_\ell(E_N) = \max_{M,X} \left|T(E_N,M,X) - M/2^\ell\right|,$$

where the maximum is taken over all $X = (x_1,\dots,x_\ell) \in \{-1,+1\}^\ell$ and $M$ such
that $0 < M \le N - \ell + 1$.
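The normality measure of Definition 2.3 can be evaluated in the same brute-force spirit. In the sketch below the name `normality_measure` is an assumption of this illustration; the code counts, for every pattern $X$ and every admissible $M$, how often $X$ occurs among the first $M$ windows of length $\ell$.

```python
from itertools import product
from typing import Sequence


def normality_measure(e: Sequence[int], order: int) -> float:
    """Brute-force N_l(E_N) = max |T(E_N, M, X) - M / 2^l| with 0 < M <= N - l + 1."""
    n = len(e)
    best = 0.0
    for x in product((-1, 1), repeat=order):   # all patterns X in {-1,+1}^l
        count = 0
        for m in range(1, n - order + 2):      # M = 1, ..., N - l + 1
            # Passing from M - 1 to M adds the window (e_M, ..., e_{M+l-1}) (1-based).
            if tuple(e[m - 1 : m - 1 + order]) == x:
                count += 1
            best = max(best, abs(count - m / 2**order))
    return best


if __name__ == "__main__":
    print(normality_measure([1, 1, 1, 1], 2))
```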


We remark that infinite analogs of the functions U, V and T have been studied be- 
fore (see, for example, [19, 66] and [111]), but the quantitative analysis of pseudoran- 
dom properties of finite sequences started with the work of Mauduit and Sárkózy [86]. 

The combined (well-distribution correlation) pseudorandom measure [86] is 
a common generalization of well-distribution and correlation measures. This mea- 
sure has an important role in the multidimensional extension of the theory of pseu- 
dorandomness (see Section 9). 
Definition 2.4. For a binary sequence $E_N = (e_1,\dots,e_N) \in \{-1,+1\}^N$ of length $N$
and for $D = (d_1,\dots,d_\ell)$ with non-negative integers $0 \le d_1 < \dots < d_\ell$, write

$$Z(E_N,a,b,t,D) = \sum_{j=0}^{t} e_{a+jb+d_1}\cdots e_{a+jb+d_\ell}.$$

Then the combined (well-distribution correlation) measure of order $\ell$ of $E_N$ is defined
as

$$Q_\ell(E_N) = \max_{a,b,t,D} |Z(E_N,a,b,t,D)| = \max_{a,b,t,D} \left|\sum_{j=0}^{t} e_{a+jb+d_1}\cdots e_{a+jb+d_\ell}\right|,$$

where the maximum is taken over all $a$, $b$, $t$ and $D = (d_1,\dots,d_\ell)$ such that all the
subscripts $a + jb + d_i$ belong to $\{1,2,\dots,N\}$.


When introducing their quantitative pseudorandom measures, the starting point
of Mauduit and Sárközy was to balance the requirements as well as possible. They
decided to introduce functions that are real-valued and positive, and the pseudorandom
properties of the sequence are characterized by the sizes of the values of these
functions. It was also an important requirement that one should be able to present
constructions for which these measures can be estimated well. It turned out that the
measures $W$ and $C_\ell$ satisfy these criteria; moreover, Rivat and Sárközy [118] later
showed that if the values of $W$ and $C_\ell$ are “small”, then the outcome of many (previously
used a posteriori) statistical tests is guaranteed to be (nearly) positive.

Although by $W$, $C_\ell$, $N_\ell$ and $Q_\ell$ many pseudorandom properties of the sequence
can be characterized, obviously not all of them can. For example, in [45] the sym- 
metry measure was introduced in order to study symmetry properties of finite binary 
sequences (later the symmetry measure was generalized by Sziklai [125]). In [135] 
Winterhof gave an excellent survey on different pseudorandom measures and certain 
constructions. This is a fast developing area and many papers have been published; 
there are too many to list all of them here. However, introducing more and more pseudorandom
measures makes it cumbersome to handle all of them. Thus it
is important to determine a not too large set of basic pseudorandom measures
which can guarantee adequate security in the applications. Research so far
indicates that the measures described in this section satisfy these criteria. The most
studied measures are $W$ and $C_\ell$, and many papers use only these measures.

In the next section we will show that for a random-type sequence (i.e. for a se- 
quence with strong pseudorandom properties) the well-distribution and correlation 
measures are expected to be small. 


3 Typical Values of Pseudorandom Measures 


In [16] Cassaigne, Ferenczi, Mauduit, Rivat and Sárközy formulated the following
principle: “The sequence $E_N$ is considered a ‘good’ pseudorandom sequence if these


Bereitgestellt von | De Gruyter / TCS 
Angemeldet 
Heruntergeladen am | 16.10.19 13:24 


Measures of Pseudorandomness — 47 


measures $W(E_N)$ and $C_\ell(E_N)$ (at least for ‘small’ $\ell$) are ‘small’.” Indeed, the security
of many cryptographic schemes is based on the property that the frequencies of
the $-1$'s and $+1$'s are about the same in certain “regular” subsequences of the
pseudorandom binary sequence $E_N \in \{-1,+1\}^N$ being used.

In [18] Cassaigne, Mauduit and Sárközy proved that for the majority of the sequences
$E_N \in \{-1,+1\}^N$ the measures $W(E_N)$ and $C_\ell(E_N)$ are around $N^{1/2}$ (up to
some logarithmic factors). Later Alon, Kohayakawa, Mauduit, Moreira and Rödl [5]
improved on these bounds:


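Theorem 3.1. Suppose that we choose each $E_N \in \{-1,+1\}^N$ with probability $1/2^N$.
For all $\varepsilon > 0$ there exist $N_0 = N_0(\varepsilon)$ and $\delta = \delta(\varepsilon) > 0$ such that for $N > N_0$ we have

$$P\left(\delta\sqrt{N} < W(E_N) < \frac{\sqrt{N}}{\delta}\right) \ge 1 - \varepsilon.$$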


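Theorem 3.2. Suppose that we choose each $E_N \in \{-1,+1\}^N$ with probability $1/2^N$.
Then for all $0 < \varepsilon < 1/16$ there is a constant $N_0 = N_0(\varepsilon)$ such that for $N > N_0$ we
have

$$P\left(\frac{2}{5}\sqrt{N\log\binom{N}{\ell}} < C_\ell(E_N) < \frac{7}{4}\sqrt{N\log\binom{N}{\ell}}\right) \ge 1 - \varepsilon.$$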


We remark that while it is important that for a binary sequence with strong pseudorandom
properties these measures should be “small”, lower bounds are not required
(this will be justified by the results of Section 4, where the minimum values
of these measures are studied). In many applications it is enough to guarantee that
$W(E_N)$ and $C_\ell(E_N)$ are $o(N)$, but for the best constructions $E_N \in \{-1,+1\}^N$ it is
proved that $W(E_N) \ll N^{1/2}\log N$, $C_\ell(E_N) \ll N^{1/2}(\log N)^{c_\ell}$ (see Section 6).


4 Minimum Values of Pseudorandom Measures 
Write

$$m(N) = \min_{E_N \in \{-1,+1\}^N} W(E_N), \qquad M_\ell(N) = \min_{E_N \in \{-1,+1\}^N} C_\ell(E_N).$$

The estimate of $m(N)$ is a classical problem. In 1964 Roth [119] proved that $m(N) \gg N^{1/4}$.
Upper bounds for $m(N)$ were given by Sárközy [32] and Beck [9]. Finally Matoušek
and Spencer [78] showed that $m(N) \ll N^{1/4}$.

The value of $M_\ell(N)$ depends on the value of the order $\ell$. Cassaigne, Mauduit
and Sárközy [18] proved that $M_\ell(N) \ll (\ell N \log N)^{1/2}$. The results of [5] improved
the implied constant factor (see Theorem 3.2 in the previous section). On the other
hand, first Cassaigne, Mauduit and Sárközy [18] proved that $M_\ell(N) \gg \log(N/\ell)$ for
even $\ell$. This was improved considerably by Alon, Kohayakawa, Mauduit, Moreira and
Rödl in [4] and [67], where the best lower bound is the following:




Theorem 4.1. If $\ell$ is even then


$$M_\ell(N) \ge \sqrt{\frac{1}{2}\left\lfloor\frac{N}{\ell+1}\right\rfloor}.$$

The proof of the theorem used deep linear algebraic tools, and later Anan- 
tharam [7] simplified the proof, but he obtained a slightly (by a constant factor) 
weaker result. 

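Cassaigne, Mauduit and Sárközy [18] noticed that the minimum values of correlation
of odd order can be very small. Namely, for the sequence $E_N = (-1,+1,-1,+1,\dots) \in \{-1,+1\}^N$
we have $C_\ell(E_N) = 1$ for odd $\ell$, since

$$e_{n+1+d_1}\cdots e_{n+1+d_\ell} = (-e_{n+d_1})\cdots(-e_{n+d_\ell}) = (-1)^\ell e_{n+d_1}\cdots e_{n+d_\ell}.$$

Thus, for odd $\ell$, consecutive terms of the sum have opposite signs and

$$\left|\sum_{n=1}^{M} e_{n+d_1}\cdots e_{n+d_\ell}\right| = |1 - 1 + 1 - 1 + \cdots| = \begin{cases} 1 & \text{if $M$ is odd,}\\ 0 & \text{if $M$ is even.} \end{cases}$$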

So $C_\ell(E_N) = 1$ and thus $M_\ell(N) = 1$ for odd $\ell$. Cassaigne, Mauduit and Sárközy [18]
also observed that although for the sequence $E_N = (-1,+1,-1,+1,\dots)$, $C_3(E_N)$
is 1, the correlation measure of order 2 is large: taking $D = (0,1)$ and $M = N-1$ in
Definition 2.2 gives $C_2(E_N) = N-1$. By solving problems
of Cassaigne, Mauduit and Sárközy [18] and Mauduit [79], in [48] I proved
that $C_2(E_N)C_3(E_N) \gg N^{2/3}$ always holds. Later Anantharam [8] proved that
$C_2(E_N)C_3(E_N) \gg N$. By the methods of the proofs it is possible to compare correlation
measures of odd and even order. With Mauduit we proved the following
sharp result in [51]:


Theorem 4.2. There is a constant $c_{k,\ell}$ depending only on $k$ and $\ell$ such that if

$$C_{2k+1}(E_N) < c_{k,\ell}N^{1/2},$$

then

$$C_{2k+1}(E_N)^{2\ell}\,C_{2\ell}(E_N)^{2k+1} \gg N^{2k+1},$$

where the implied constant factor depends only on $k$ and $\ell$.
This theorem has the following consequences: 


Corollary 4.3. If $C_{2k+1}(E_N) = O(1)$, then $C_{2\ell}(E_N) \gg N$, where the implied constant
factor depends on $k$ and $\ell$.


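Corollary 4.4.

$$C_{2k+1}(E_N)\,C_{2\ell}(E_N) \gg N^{c(k,\ell)},$$

where the implied constant factor depends only on $k$ and $\ell$ and where

$$c(k,\ell) = \begin{cases} 1 & \text{if } k \ge \ell,\\ \frac{1}{2} + \frac{2k+1}{4\ell} & \text{if } k < \ell. \end{cases}$$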
The minimum of the normality measure was studied in [4] and [67], but there is
a huge gap between the lower and upper bounds. 


5 Connection between Pseudorandom Measures 


It is a problem of basic importance to study the connections between the different 
pseudorandom measures. For example, Mauduit and Sárkózy [86] proved that the 
normality measure can be bounded by the maximum of correlation measures: 


Theorem 5.1.

$$N_\ell(E_N) \le \max_{1 \le t \le \ell} C_t(E_N).$$


Since the normality measures can be estimated by the correlation measures, most
papers do not handle the normality measures separately; they only give non-trivial
upper bounds for the well-distribution and correlation measures.
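The reason behind Theorem 5.1 can be sketched in a few lines. The derivation below is the standard indicator-expansion argument written in the notation of Definitions 2.2 and 2.3; it is offered as an illustration and is not quoted from [86]. Since $(1 + x_i e_{n+i})/2$ equals 1 when $e_{n+i} = x_i$ and 0 otherwise,

```latex
\begin{align*}
T(E_N,M,X)
  &= \sum_{n=0}^{M-1} \prod_{i=1}^{\ell} \frac{1 + x_i e_{n+i}}{2} \\
  &= \frac{M}{2^{\ell}}
     + \frac{1}{2^{\ell}} \sum_{\emptyset \neq S \subseteq \{1,\dots,\ell\}}
       \Big( \prod_{i \in S} x_i \Big) \sum_{n=1}^{M} \prod_{i \in S} e_{n+i-1} .
\end{align*}
```

Each inner sum is of the form $V(E_N, M, D)$ with $D$ consisting of the shifts $i - 1$ for $i \in S$; its largest shift is at most $\ell - 1$ and $M + \ell - 1 \le N$, so its absolute value is at most $C_{|S|}(E_N)$. There are $2^\ell - 1$ non-empty sets $S$, hence $|T(E_N,M,X) - M/2^\ell| \le \frac{2^\ell - 1}{2^\ell} \max_{1 \le t \le \ell} C_t(E_N)$, which gives the stated bound.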

Cassaigne, Mauduit and Sárközy [18] compared correlation measures of different 
orders: 


Theorem 5.2. Suppose that $2 \le k \mid \ell$ and $E_N \in \{-1,+1\}^N$. Then

$$C_k(E_N) \ll N^{1-k/\ell}\,(C_\ell(E_N))^{k/\ell}.$$


If $k \nmid \ell$, it is possible to construct a sequence $E_N$ for which $C_k(E_N)$ is large but
$C_\ell(E_N)$ is small:

Theorem 5.3. Suppose that $2 \le k, \ell$ and $k \nmid \ell$. Then there is a sequence
$E_N \in \{-1,+1\}^N$ for which
N 2 
Cr(En) > k 1 — 54k‘ logN , 
Co(En) < 27k? £N? logN . 


Indeed in [18], Theorem 5.2 and Theorem 5.3 were proved in a sharper form. 
The well-distribution measure can be estimated by the correlation measures 
of even order. In [92] Mauduit and Sárközy proved that for all sequences $E_N \in \{-1,+1\}^N$
we have

$$W(E_N) \le 3\sqrt{N\,C_2(E_N)}.$$


Later in [42] and [44] I generalized this inequality to correlation measures
of any even order:


Theorem 5.4. For all sequences $E_N \in \{-1,+1\}^N$ we have

$$W(E_N) \ll N^{1-1/(2\ell)}\,(C_{2\ell}(E_N))^{1/(2\ell)}. \tag{5.1}$$


In [42] I also proved that (5.1) is sharp apart from the implied constant factor. 
6 Constructions


First Mauduit and Sárkózy [86] studied the well-distribution and correlation measures 
of a finite binary sequence. Their construction was the following: 


Construction 6.1. Let $p$ be a prime number, $N = p - 1$ and define the Legendre sequence
$E_N = (e_1, e_2, \dots, e_N) \in \{-1,+1\}^N$ by

$$e_n = \left(\frac{n}{p}\right),$$

where $\left(\frac{\cdot}{p}\right)$ denotes the Legendre symbol.


Then by Theorem 1 in [86] for the sequence $E_N$ defined in Construction 6.1 we
have

$$W(E_N) \ll N^{1/2}\log N \quad\text{and}\quad C_\ell(E_N) \ll N^{1/2}\log N.$$
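To make Construction 6.1 tangible, here is a minimal Python sketch that generates the Legendre sequence with Euler's criterion and compares its well-distribution measure with $N^{1/2}\log N$. The helper names `legendre_symbol`, `legendre_sequence` and `well_distribution` are assumptions of this illustration, $p$ is assumed to be an odd prime, and the brute-force measure is only feasible for small $p$.

```python
from math import log, sqrt
from typing import List, Sequence


def legendre_symbol(n: int, p: int) -> int:
    """Legendre symbol (n/p) for an odd prime p, computed via Euler's criterion."""
    r = pow(n % p, (p - 1) // 2, p)   # r is 0, 1 or p - 1
    return -1 if r == p - 1 else r


def legendre_sequence(p: int) -> List[int]:
    """E_N = ((1/p), (2/p), ..., ((p-1)/p)) with N = p - 1."""
    return [legendre_symbol(n, p) for n in range(1, p)]


def well_distribution(e: Sequence[int]) -> int:
    """Brute-force W(E_N) over all arithmetic progressions inside {1, ..., N}."""
    n = len(e)
    best = 0
    for a in range(1, n + 1):
        for b in range(1, n + 1):
            partial, idx = 0, a
            while idx <= n:
                partial += e[idx - 1]
                best = max(best, abs(partial))
                idx += b
    return best


if __name__ == "__main__":
    p = 101
    seq = legendre_sequence(p)
    # Print W(E_N) next to the order of magnitude N^{1/2} log N from the bound above.
    print(well_distribution(seq), sqrt(p - 1) * log(p - 1))
```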


After their first paper [86] on pseudorandomness, Mauduit and Sárkózy contin- 
ued with a series of papers ([16—18, 87-89]) in which they tested several construc- 
tions. Since then numerous constructions have been given, see, for example, [21, 23, 
26, 28, 29, 36, 41, 71-73, 75, 82, 109, 112, 113, 116, 121]. We remark that the majority of 
these constructions are of modular type. It would be interesting to give a construc- 
tion which is not of modular type, but (nearly) optimal bounds can be proved for its 
pseudorandom measures. 

First for fixed N most constructions produced only a single sequence of length N; 
however, in many applications one needs many pseudorandom binary sequences. In 
2004 Goubin, Mauduit and Sárkózy [40] succeeded in constructing large families of 
pseudorandom binary sequences based on the Legendre symbol. Their construction 
was the following: 


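Construction 6.2. Let $K \in \mathbb{N}$, $p$ be a prime number and denote by $\mathcal{P}$ the set of polynomials
$f(x) \in \mathbb{F}_p[x]$ of degree $k$, where $0 < k \le K$ and which have no multiple
zero in $\overline{\mathbb{F}}_p$ (= the algebraic closure of $\mathbb{F}_p$). For $f \in \mathcal{P}$ define the binary sequence
$E_p(f) = (e_1, \dots, e_p)$ by

$$e_n = \begin{cases} \left(\frac{f(n)}{p}\right) & \text{for } (f(n),p) = 1,\\ +1 & \text{for } p \mid f(n). \end{cases} \tag{6.1}$$

Let $\mathcal{F} = \{E_p(f) : f \in \mathcal{P}\}$.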
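The rule (6.1) translates directly into code. In the sketch below the polynomial is passed as a coefficient list from the leading term down; the names `poly_eval` and `sequence_from_polynomial` are hypothetical, $p$ is assumed to be an odd prime, and the caller is assumed to supply polynomials without multiple zeros (the code does not check this condition).

```python
from typing import List, Sequence


def legendre_symbol(n: int, p: int) -> int:
    """Legendre symbol (n/p) for an odd prime p, via Euler's criterion."""
    r = pow(n % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def poly_eval(coeffs: Sequence[int], x: int, p: int) -> int:
    """Evaluate f(x) mod p by Horner's rule; coeffs run from the leading term down."""
    value = 0
    for c in coeffs:
        value = (value * x + c) % p
    return value


def sequence_from_polynomial(coeffs: Sequence[int], p: int) -> List[int]:
    """E_p(f) = (e_1, ..., e_p) following rule (6.1)."""
    seq = []
    for n in range(1, p + 1):
        v = poly_eval(coeffs, n, p)
        seq.append(1 if v == 0 else legendre_symbol(v, p))   # +1 when p divides f(n)
    return seq


if __name__ == "__main__":
    p = 7
    # Three squarefree example polynomials: x, x + 1 and x^2 + 1.
    for f in [(1, 0), (1, 1), (1, 0, 1)]:
        print(f, sequence_from_polynomial(f, p))
```

Different polynomials give different members of the family, which is what makes the construction usable when many sequences are needed.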


Clearly F is a large family of pseudorandom binary sequences. Goubin, Mauduit 
and Sárközy [40] proved that, under some not too restrictive conditions on the poly- 
nomials f, the sequences E, (f) have strong pseudorandom properties: 


Theorem 6.3. Let $p$, $\mathcal{P}$ and $\mathcal{F}$ be defined as in Construction 6.2 and for $f \in \mathcal{P}$ define
$E_p = E_p(f) \in \mathcal{F}$ by (6.1). Let $k$ be the degree of $f$. Then

$$W(E_p) \ll k\,p^{1/2}\log p.$$

Moreover, assume that for $\ell \in \mathbb{N}$ one of the following assumptions holds:
(i) $\ell = 2$;
(ii) $\ell < p$ and 2 is a primitive root modulo $p$;
(iii) $(4k)^\ell < p$.
Then we also have

$$C_\ell(E_p) \ll k\ell\,p^{1/2}\log p.$$


We remark that several important a posteriori tests (indicated by the 1.4-sts. pack- 
age of the National Institute of Standards and Technology) were checked by Rivat and 
Sárkózy [118] by computer for many sequences generated by Construction 6.2. In each 
case they obtained that the sequence passes all these tests. 

The next construction was based on the discrete logarithm [43]: 


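Construction 6.4. Let $K \in \mathbb{N}$, $p$ be an odd prime number, and denote by $\mathcal{P}'$ the set of
polynomials $f(x) \in \mathbb{F}_p[x]$ of degree $k$, where $0 < k \le K$. Let $g$ be a primitive root
modulo $p$ and define $\operatorname{ind} n$ by $n \equiv g^{\operatorname{ind} n} \pmod p$ and $1 \le \operatorname{ind} n \le p - 1$. For
$f \in \mathcal{P}'$ define the binary sequence $E_{p-1}(f) = (e_1, \dots, e_{p-1})$ by

$$e_n = \begin{cases} +1 & \text{if } 1 \le \operatorname{ind} f(n) \le (p-1)/2,\\ -1 & \text{if } (p+1)/2 \le \operatorname{ind} f(n) \le p-1 \text{ or } p \mid f(n). \end{cases}$$

Let $\mathcal{F}' = \{E_{p-1}(f) : f \in \mathcal{P}'\}$.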


This construction is nearly as good as Construction 6.2; the only problem is that
it is slow to compute $e_n$, since no fast algorithm is known to compute $\operatorname{ind} n$. In [44]
this construction was slightly modified such that the sequences in the new construction
can be generated faster. Since then many other constructions of large families of
pseudorandom sequences have been given (see, for example, [22, 24, 34, 35, 40, 43,
44, 59, 69, 74, 81, 84, 96-98, 117, 123, 127]).

Most constructions use finite fields and character sums over it (see the survey 
paper [127] for the most frequently used character sum estimates). One of the main 
tools in estimating the pseudorandom measures is Weil’s theorem [133]: 


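Lemma 6.5. Suppose that $\mathbb{F}_q$ is a finite field, $\chi$ is a non-principal character of order $d$
over it, $f \in \mathbb{F}_q[x]$ has $s$ distinct roots in $\overline{\mathbb{F}}_q$ and it is not a constant multiple of the $d$-th
power of a polynomial over $\mathbb{F}_q$. Then

$$\left|\sum_{n \in \mathbb{F}_q} \chi(f(n))\right| \le (s-1)\,q^{1/2}.$$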


More precisely, the proofs of Theorem 6.3 and several other theorems (involving 
estimates of pseudorandom measures of different modular type constructions) are 
based on incomplete sums of multiplicative and additive characters. Such results can 
be derived from Weil's theorems on complete character sums (see, e.g. Lemma 6.5) by 
using a method of Vinogradov [131] (see also [64, 114, 126]). 
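A quick numerical sanity check of Lemma 6.5 in its simplest setting may be helpful. The sketch below assumes $q = p$ is an odd prime and $\chi$ is the quadratic character (so $d = 2$), and it evaluates the complete sum for the example polynomial $f(x) = x^3 + x + 1$, which has $s = 3$ distinct roots whenever $p \ne 31$ (its discriminant is $-31$). The function names are hypothetical and chosen for this illustration.

```python
from math import sqrt
from typing import Sequence


def quadratic_character(v: int, p: int) -> int:
    """Quadratic character modulo an odd prime p, with the convention chi(0) = 0."""
    r = pow(v % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def poly_eval(coeffs: Sequence[int], x: int, p: int) -> int:
    """Evaluate f(x) mod p by Horner's rule; coeffs run from the leading term down."""
    value = 0
    for c in coeffs:
        value = (value * x + c) % p
    return value


def complete_character_sum(coeffs: Sequence[int], p: int) -> int:
    """Sum of chi(f(n)) over all n in F_p."""
    return sum(quadratic_character(poly_eval(coeffs, n, p), p) for n in range(p))


if __name__ == "__main__":
    for p in (101, 103, 107):
        total = complete_character_sum((1, 0, 1, 1), p)   # f(x) = x^3 + x + 1
        # Lemma 6.5 with s = 3 predicts |total| <= 2 * sqrt(p).
        print(p, total, 2 * sqrt(p))
```

The incomplete sums that actually occur in the proofs run over intervals or arithmetic progressions rather than over all of $\mathbb{F}_p$; this sketch only covers the complete sum.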
Although many constructions exist, Construction 6.2 is one of the best: we have
optimally good bounds for the pseudorandom measures and the elements of the se- 
quences can be generated fast. In the next section we will analyze structural proper- 
ties of large families of pseudorandom binary sequences. 


7 Family Measures 


In many applications it is not enough if our family $\mathcal{F}$ is large. For example, if $\mathcal{F}$
contains many sequences but they differ only in the last few bits, then one cannot
use more than one sequence from the family. So it is very important to guarantee
that the family $\mathcal{F}$ has a “rich”, “complex” structure, i.e. there are many “independent”
sequences in it which are “far apart”. Thus one needs quantitative measures to study
the structural properties of families of binary sequences. The first family measure was
introduced by Ahlswede, Khachatrian, Mauduit and Sárközy in [1]:


Definition 7.1. Suppose that $\mathcal{F}$ is a family of binary sequences
$E_N = (e_1, e_2, \dots, e_N) \in \{-1,+1\}^N$ and $(\varepsilon_1, \varepsilon_2, \dots, \varepsilon_j) \in \{-1,+1\}^j$ is a fixed binary sequence of
length $j$ (for some $j \le N$), and let $1 \le i_1 < i_2 < \dots < i_j \le N$. If we consider binary
sequences $E_N = (e_1, e_2, \dots, e_N) \in \{-1,+1\}^N$ with

$$e_{i_1} = \varepsilon_1,\ e_{i_2} = \varepsilon_2,\ \dots,\ e_{i_j} = \varepsilon_j, \tag{7.1}$$

then (7.1) is said to be a specification of length $j$ (of the binary sequence $E_N$).


Definition 7.2. The family complexity or briefly f-complexity of a family $\mathcal{F}$ of binary
sequences $E_N \in \{-1,+1\}^N$ is defined as the greatest integer $j$ such that for
any specification (7.1) (of length $j$) there is at least one $E_N \in \mathcal{F}$ which satisfies it.
The f-complexity of $\mathcal{F}$ is denoted by $\Gamma(\mathcal{F})$. (If there is no $j \in \mathbb{N}$ with the property
above, we set $\Gamma(\mathcal{F}) = 0$.)


Note that an easy consequence of the definition is 


Proposition 7.3.

$$\Gamma(\mathcal{F}) \le \frac{\log|\mathcal{F}|}{\log 2}. \tag{7.2}$$
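For very small families the f-complexity of Definition 7.2 can be computed exhaustively, which also makes the bound of Proposition 7.3 visible: a family realizing all $2^j$ sign patterns on some $j$ positions must have at least $2^j$ members. The function name `family_complexity` and the representation of a family as a list of tuples are assumptions of this sketch.

```python
from itertools import combinations, product
from math import log2
from typing import Sequence, Tuple


def family_complexity(family: Sequence[Tuple[int, ...]]) -> int:
    """Greatest j such that every specification of length j is met by some member."""
    n = len(family[0])
    best = 0
    for j in range(1, n + 1):
        for positions in combinations(range(n), j):
            # Sign patterns that the family realizes on the chosen positions.
            seen = {tuple(seq[i] for i in positions) for seq in family}
            if len(seen) < 2**j:       # some specification of length j is not satisfied
                return best
        best = j
    return best


if __name__ == "__main__":
    full = list(product((-1, 1), repeat=3))        # all 8 sequences of length 3
    constant = [(1, 1, 1), (-1, -1, -1)]           # only two, highly dependent, sequences
    for fam in (full, constant):
        print(family_complexity(fam), log2(len(fam)))
```

Stopping at the first failing $j$ is justified because a family that satisfies every specification of length $j$ also satisfies every shorter one.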


Ahlswede, Khachatrian, Mauduit and Sárközy [1] showed that for the family $\mathcal{F}$
defined in Construction 6.2, the f-complexity $\Gamma(\mathcal{F})$ is large. Later Gyarmati [47] improved
on their lower bound by showing that $\Gamma(\mathcal{F}) \ge c\log|\mathcal{F}|$ with some explicit
constant $c$; we note that by (7.2), this estimate is best possible apart from the value
of this constant $c$, and thus the f-complexity of this family is optimally large (apart
from the constant factor). Since then the family complexity of many other constructions
has also been studied by several authors. In [85] Mauduit and Sárközy gave a survey
paper on family complexity.
Another important tool for studying the pseudorandomness of families of binary
sequences is the notion of collision (see, for example, [10, 95, 129, 130]): 

Assuming that $N \in \mathbb{N}$, $S$ is a given set (e.g. a set of certain polynomials or the
set of all the binary sequences of a given length much less than $N$), to each $s \in S$ we
assign a unique binary sequence

$$E_N = E_N(s) = (e_1, \dots, e_N) \in \{-1,+1\}^N,$$

and let $\mathcal{F} = \mathcal{F}(S)$ denote the family of the binary sequences obtained in this way:

$$\mathcal{F} = \mathcal{F}(S) = \{E_N(s) : s \in S\}. \tag{7.3}$$

Definition 7.4. If $s \in S$, $s' \in S$, $s \ne s'$ and

$$E_N(s) = E_N(s'), \tag{7.4}$$

then (7.4) is said to be a collision in $\mathcal{F} = \mathcal{F}(S)$. If there is no collision in $\mathcal{F} = \mathcal{F}(S)$,
then $\mathcal{F}$ is said to be collision free.


In other words, F = F (S) is collision free if we have |F| = |S|. It turns out that 
in the best constructions, the families of pseudorandom binary sequences are colli- 
sion free. If F is not collision free but the number of collisions is “small”, then they 
may cause only minor problems in the applications. A good measure of the number 
of collisions is the following: 


Definition 7.5. The collision maximum $M = M(\mathcal{F}, S)$ is defined by

$$M = M(\mathcal{F}, S) = \max_{E_N \in \mathcal{F}} |\{s : s \in S,\ E_N(s) = E_N\}|$$

(i.e. $M$ is the maximal number of elements of $S$ representing the same binary sequence
$E_N$, and $\mathcal{F} = \mathcal{F}(S)$ is collision free if and only if $M(\mathcal{F}, S) = 1$).


Another important family requirement is the avalanche effect (see, e.g. [10, 33, 65, 
129, 130]) which studies that by changing a few bits of the seed how many elements 
of the output sequence will change. 


Definition 7.6. If in (7.3) we have $S = \{-1,+1\}^\ell$, and for any $s \in S$, changing any
element of $s$ changes “many” elements of $E_N(s)$ (i.e. for $s \ne s'$ many elements of
the sequences $E_N(s)$ and $E_N(s')$ are different), then we speak about an avalanche
effect, and we say that $\mathcal{F} = \mathcal{F}(S)$ possesses the avalanche property. If $N \to \infty$ and
for any $s \in S$, $s' \in S$, $s \ne s'$ at least $(\frac{1}{2} - o(1))N$ elements of $E_N(s)$ and $E_N(s')$ are
different, then $\mathcal{F}$ is said to possess the strict avalanche property.


To study the avalanche property, one may introduce the following quantitative 
measure: 
Definition 7.7. If $N \in \mathbb{N}$, $E_N = (e_1, \dots, e_N) \in \{-1,+1\}^N$ and
$E_N' = (e_1', \dots, e_N') \in \{-1,+1\}^N$, then the distance $d(E_N, E_N')$ between $E_N$ and $E_N'$ is defined
by

$$d(E_N, E_N') = |\{n : 1 \le n \le N,\ e_n \ne e_n'\}|$$

(a similar notion is introduced in [10]; this is a variant of the Hamming distance).
Moreover, if $\mathcal{F}$ is a family of the form (7.3), then the distance minimum $m(\mathcal{F})$ of $\mathcal{F}$ is
defined by

$$m(\mathcal{F}) = \min_{\substack{s, s' \in S \\ s \ne s'}} d(E_N(s), E_N(s')).$$

Thus the family $\mathcal{F}$ in (7.3) is collision free if and only if $m(\mathcal{F}) > 0$, and $\mathcal{F}$ possesses
the strict avalanche property if

$$m(\mathcal{F}) \ge \left(\tfrac{1}{2} - o(1)\right)N.$$
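The collision maximum of Definition 7.5 and the distance minimum of Definition 7.7 are easy to compute when the seed set $S$ is small. The sketch below uses two toy generators, `repeat_twice` and `product_only`, which are invented purely for illustration and do not correspond to any construction in this survey; the function names are likewise assumptions.

```python
from collections import Counter
from itertools import combinations, product
from typing import Callable, Sequence, Tuple

Seed = Tuple[int, ...]
Generator = Callable[[Seed], Tuple[int, ...]]


def collision_maximum(seeds: Sequence[Seed], generate: Generator) -> int:
    """M(F, S): the largest number of seeds mapped to one and the same sequence."""
    counts = Counter(tuple(generate(s)) for s in seeds)
    return max(counts.values())


def distance(a: Sequence[int], b: Sequence[int]) -> int:
    """d(E_N, E_N'): number of positions where the two sequences differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def distance_minimum(seeds: Sequence[Seed], generate: Generator) -> int:
    """m(F): minimum distance over all pairs of distinct seeds."""
    return min(distance(generate(s), generate(t)) for s, t in combinations(seeds, 2))


def repeat_twice(seed: Seed) -> Tuple[int, ...]:
    """Toy generator: the output is the seed written twice (collision free)."""
    return tuple(seed) * 2


def product_only(seed: Seed) -> Tuple[int, ...]:
    """Toy generator: the output depends only on the product of the two seed entries."""
    return (seed[0] * seed[1],) * 4


if __name__ == "__main__":
    seeds = list(product((-1, 1), repeat=2))
    for gen in (repeat_twice, product_only):
        print(gen.__name__, collision_maximum(seeds, gen), distance_minimum(seeds, gen))
```

The second generator has collisions, so its distance minimum is 0, in line with the remark that a family is collision free exactly when the distance minimum is positive.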


In [129] Tóth studied the Legendre symbol construction described in Construction 6.2
and she showed that a variant of the family defined there (she replaced the
condition $\deg f(x) \le K$ by $\deg f(x) = K$) is collision free if $K < p^{1/2}/2$ and it possesses
the strong avalanche effect for $p \to \infty$, $K = o(p^{1/2})$. In [130] she also studied
a further construction using additive characters and she showed that there are many
collisions in it, but a large subfamily of it possesses the strong avalanche property.


8 Linear Complexity 


Cryptographic applications require pseudorandom sequences which are “unpre- 
dictable” in a certain sense. Kolmogorov [68] and Chaitin [20] introduced the notion 
of Kolmogorov complexity, which is roughly speaking the length of the shortest com- 
puter program which generates the given sequence in a fixed Turing machine. From 
this point of view, a sequence can be considered a bad pseudorandom sequence if its 
Kolmogorov complexity is “small”. Unfortunately, in practice, it is usually hopeless 
to compute the Kolmogorov complexity for a fixed sequence, thus this definition can- 
not be used in the applications. In this section we analyze a related measure, linear 
complexity, which is a computable measure. Mainly we will study the connection 
between linear complexity and other pseudorandom measures. 

Feedback shift registers, in particular linear feedback shift registers are used in 
many cryptographic stream ciphers (see, e.g. [95]). The linear feedback shift registers 
(LFSR) have many equivalent definitions, here I use one from [132]: 


Definition 8.1. The linear feedback shift register is a sequence of 0-1 bits
$(s_1, s_2, \dots, s_\ell, c_1, \dots, c_\ell)$ with $c_1 = 1$. The output of the LFSR is the infinite sequence $(s_1, s_2, \dots)$
where $s_i$ ($\in \{0,1\}$) for $i > \ell$ is defined by the following equation:

$$s_i = \sum_{j=1}^{\ell} c_j s_{i-\ell-1+j} \pmod 2.$$

An LFSR $L(s_1, s_2, \dots, s_\ell, c_1, \dots, c_\ell)$ is said to generate an infinite sequence
$s = (s_1, s_2, \dots)$ if $s$ is the output sequence of $L(s_1, s_2, \dots, s_\ell, c_1, \dots, c_\ell)$. The linear
complexity of an infinite sequence $s$, denoted by $L(s)$, is defined as follows:
(1) If $s$ is the zero sequence $(0, 0, 0, \dots)$, then $L(s) = 0$.
(2) If no LFSR generates $s$, then $L(s) = \infty$.
(3) Otherwise $L(s)$ is the length of the shortest LFSR that generates $s$.
For a finite sequence $s \in \{0,1\}^N$, the linear complexity $L(s)$ is the length of the
shortest LFSR that generates an infinite sequence whose first $N$ bits form the finite
sequence $s$.


The relationship between linear complexity and Kolmogorov complexity was 
studied in [13, 132]. The linear complexity is an important cryptographic characteris- 
tic of sequences (see the monographs and surveys [27, 93, 95, 102, 108, 128, 134]). An 
excellent historical survey on the linear complexity is given in [115]. Here I mention 
only some of the most important properties of the linear complexity. It is known [120]
that the linear complexity of a truly random bit sequence $s = (s_1, s_2, \dots, s_N) \in \{0,1\}^N$
is $(1 + o(1))\frac{N}{2}$. Based on this fact a sequence with low linear complexity is
usually considered a “bad” pseudorandom sequence.

Using the Berlekamp-Massey algorithm (which is due to Massey [77] and based
on an earlier algorithm of Berlekamp [12]), it is possible to calculate the value of the
linear complexity of a fixed finite sequence. The linear complexity is usually defined
for 0-1 sequences (note that it can be defined similarly in the case of sequences of
elements of $\mathbb{F}_q$ or $\mathbb{Z}_m$), but in this survey we study mostly $\pm 1$ sequences. This problem
can be easily avoided: there is a natural bijection $\varphi : \{-1,+1\}^N \to \{0,1\}^N$. Namely,
if the sequence $E_N \in \{-1,+1\}^N$ is given, then $\varphi(E_N)$ can be defined by

$$\varphi(E_N) = \varphi((e_1, e_2, \dots, e_N)) = S_N = (s_0, s_1, \dots, s_{N-1}) \in \{0,1\}^N$$

with $s_i = \frac{1 - e_{i+1}}{2}$ (or equivalently $(-1)^{s_i} = e_{i+1}$) for $i = 0, 1, \dots, N-1$.
Hence we may define the linear complexity of the binary sequence $E_N \in \{-1,+1\}^N$
by

$$L(E_N) = L(\varphi(E_N)).$$
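The computation described above can be sketched as follows. The code is a textbook version of the Berlekamp-Massey algorithm over GF(2) together with the map $\varphi$; the function names `to_bits` and `linear_complexity` are assumptions of this illustration, and the routine returns the usual linear complexity (the length of the shortest linear recurrence satisfied by the given bits).

```python
from typing import List, Sequence


def to_bits(e: Sequence[int]) -> List[int]:
    """The map phi: s_i = (1 - e_{i+1}) / 2, sending +1 to 0 and -1 to 1."""
    return [(1 - x) // 2 for x in e]


def linear_complexity(bits: Sequence[int]) -> int:
    """Berlekamp-Massey over GF(2): length of the shortest LFSR producing `bits`."""
    n = len(bits)
    c = [0] * (n + 1)    # current connection polynomial C(x)
    b = [0] * (n + 1)    # connection polynomial saved at the last length change
    c[0] = b[0] = 1
    length, m = 0, -1    # current LFSR length, index of the last length change
    for i in range(n):
        # Discrepancy between bits[i] and the bit predicted by the current LFSR.
        d = bits[i]
        for j in range(1, length + 1):
            d ^= c[j] & bits[i - j]
        if d == 0:
            continue
        previous = c[:]
        shift = i - m
        for j in range(n + 1 - shift):
            c[j + shift] ^= b[j]          # C(x) <- C(x) + x^shift * B(x)
        if 2 * length <= i:
            length, m, b = i + 1 - length, i, previous
    return length


if __name__ == "__main__":
    alternating = [-1, 1, -1, 1, -1, 1]
    print(linear_complexity(to_bits(alternating)))
```

The alternating sequence satisfies the recurrence $s_i = s_{i-2}$, so its linear complexity is very small even though its odd-order correlation measures are minimal.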
Brandstátter and Winterhof [14] showed that the linear complexity of a binary 
sequence Ey can be estimated in terms of the correlation measures of the sequence: 


Theorem 8.2. If $N > 2$ and $E_N$ is a binary sequence then we have

$$L(E_N) \ge N - \max_{1 \le k \le L(E_N)+1} C_k(E_N).$$
Using this inequality they were able to give (in some cases quite strong) lower
estimates for the linear complexity of binary sequences occurring in certain construc- 
tions. While this theorem may give quite good estimates for linear complexity, it has 
the disadvantage that it also uses correlations of high order which can be very difficult 
to estimate. Thus Andics [8] proved another inequality which uses the correlation of 
order 2 only (but it usually gives a weak lower bound): 


Theorem 8.3. If $N \in \mathbb{N}$ and $E_N$ is a binary sequence then we have

$$2^{L(E_N)} \ge N - C_2(E_N).$$


Further results related to the pseudorandom measures and linear complexity can 
be found in several works (see, e.g. the papers of Winterhof and co-authors [6, 14, 15, 
25, 37, 93, 94, 124, 128, 134]). 


9 Multidimensional Theory 


In the recent years, the one-dimensional theory of pseudorandomness has been ex- 
tended to several dimensions. For example, when we would like to encrypt a digital 
map or image by the multidimensional analog of the Vernam cipher, then instead of 
a pseudorandom binary sequence we need a two or more dimensional pseudorandom 
binary lattice as a keystream. The multidimensional theory of pseudorandomness 
was developed by Hubert, Mauduit and Sárkózy [62]. They introduced the following 
definitions: 

Denote by $I_N^n$ the set of $n$-dimensional vectors whose coordinates are integers
between 0 and $N - 1$:

$$I_N^n = \{x = (x_1, \dots, x_n) : x_1, \dots, x_n \in \{0, 1, \dots, N-1\}\}.$$

This set is called an $n$-dimensional $N$-lattice or briefly $N$-lattice. Next they extended
this definition to more general lattices in the following way: Let $u_1, u_2, \dots, u_n$ be $n$
linearly independent vectors, where the $i$-th coordinate of $u_i$ is a non-zero integer,
and the other coordinates of $u_i$ are 0, so $u_i$ is of the form $(0, \dots, 0, z_i, 0, \dots, 0)$. Let
$t_1, t_2, \dots, t_n$ be integers with $0 \le t_1, t_2, \dots, t_n < N$. Then we will call the set

$$B_N^n = \{x = x_1u_1 + \dots + x_nu_n : x_i \in \mathbb{N} \cup \{0\},\ 0 \le x_i|u_i| \le t_i\ (< N) \text{ for } i = 1, \dots, n\}$$

an $n$-dimensional box $N$-lattice or briefly a box $N$-lattice.
In [62] the definition of binary sequences is extended to more dimensions by con- 
sidering functions of type 


$$e_x = \eta(x) : I_N^n \to \{-1,+1\}.$$


If $x = (x_1, \dots, x_n)$ so that $\eta(x) = \eta((x_1, \dots, x_n))$ then we will slightly simplify
the notation by writing $\eta(x) = \eta(x_1, \dots, x_n)$. These functions are called binary
$N$-lattices or briefly binary lattices. One may visualize a binary lattice as the lattice
points of the $N$-lattice replaced by the two symbols $+$ and $-$.

In [62] Hubert, Mauduit and Sárközy introduced the following pseudorandom 
measure of binary lattices (here we will present the definition in a slightly modified 
but equivalent form): 


Definition 9.1. Let

$$\eta : I_N^n \to \{-1,+1\}$$

be a binary lattice. Define the pseudorandom measure of order $\ell$ of $\eta$ by

$$Q_\ell(\eta) = \max_{B, d_1, \dots, d_\ell} \left|\sum_{x \in B} \eta(x + d_1)\cdots\eta(x + d_\ell)\right|,$$

where the maximum is taken over all distinct $d_1, \dots, d_\ell \in I_N^n$ and box $N$-lattices $B$
such that $B + d_1, \dots, B + d_\ell \subseteq I_N^n$.


Then $\eta$ is said to have strong pseudorandom properties, or briefly, it is considered
a “good” pseudorandom lattice if for fixed $n$ and $\ell$ and “large” $N$ the measure
$Q_\ell(\eta)$ is “small” (much smaller than the trivial upper bound $N^n$). This terminology
is justified by the fact that, as was proved in [62], for a truly random binary lattice
defined on $I_N^n$ and for fixed $\ell$ the measure $Q_\ell(\eta)$ is “small” (less than $N^{n/2}$ multiplied
by a logarithmic factor).

Recently several multidimensional constructions have been given for lattices with 
strong pseudorandom properties, see, for example, [52, 60-62, 70, 83, 91, 99, 100]. 

Some one-dimensional theorems can be generalized to the multidimensional 
case. For example, we studied the properties of the multidimensional pseudorandom 
measures in [53—58]. In particular, in [58] we compared the one-dimensional pseudo- 
random measures with the two or more dimensional pseudorandom measures and 
we showed that the study of the multidimensional measures cannot be reduced to 
one-dimensional ones, so indeed it was necessary to develop the multidimensional 
theory. In [55-57] we introduced the multidimensional analog of the normality, corre- 
lation and symmetry measures. We studied the connection between multidimension- 
al pseudorandom measures of different orders and we proved the multidimensional 
analog of Theorem 5.1. We also studied the minimal values of the multidimensional 
pseudorandom measures. In [46] further multidimensional pseudorandom measures 
were introduced. In [53] and [54] the notions of family complexity, collision and 
avalanche effect were extended and studied in the multidimensional case. 


10 Extensions 


Pseudorandom binary sequences have many further generalizations. For example,
Mauduit and Sárközy [90], Ahlswede, Mauduit and Sárközy [2, 3], Bérczi [11], Marzouk
and Winterhof [76] and Mérai [101] studied the case of sequences of $k$ symbols.
Hubert and Sárközy [63] studied the case of $p$-pseudorandom binary sequences, i.e.
the case when the binary sequences simulate the binomial distribution of parameter
$p$. Niederreiter, Rivat and Sárközy [110] studied pseudorandom sequences of binary
vectors. In [30] and [31] Dartyge and Sárközy started to study pseudorandom
subsets of $\{1, 2, \dots, N\}$ and $\mathbb{Z}_n$. In [49] and [50] we studied pseudorandom binary
functions on rooted plane trees. The connection between pseudorandom binary and
[0, 1) sequences was analyzed in [80] by Mauduit, Niederreiter and Sárközy.